perf[array]: bool filter kernel optimisation #7125
Merging this PR will improve performance by ×6.1
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | Simulation | chunked_varbinview_opt_canonical_into[(1000, 10)] | 187.9 µs | 225.3 µs | -16.6% |
| ⚡ | Simulation | density_sweep_dense_runs[0.001] | 578.7 µs | 44.5 µs | ×13 |
| ⚡ | Simulation | density_sweep_dense_runs[0.005] | 578.4 µs | 44.5 µs | ×13 |
| ⚡ | Simulation | density_sweep_dense_runs[0.01] | 578.4 µs | 44.5 µs | ×13 |
| ⚡ | Simulation | density_sweep_dense_runs[0.02] | 578.6 µs | 44.5 µs | ×13 |
| ⚡ | Simulation | density_sweep_dense_runs[0.05] | 579.2 µs | 44.5 µs | ×13 |
| ⚡ | Simulation | density_sweep_dense_runs[0.1] | 582 µs | 44.6 µs | ×13 |
| ⚡ | Simulation | density_sweep_dense_runs[0.5] | 713.3 µs | 45.6 µs | ×16 |
| ⚡ | Simulation | density_sweep_dense_runs[0.95] | 1,539.9 µs | 49.6 µs | ×31 |
| ⚡ | Simulation | density_sweep_dense_runs[0.9999] | 229.1 µs | 45.7 µs | ×5 |
| ⚡ | Simulation | density_sweep_dense_runs[0.999] | 235.6 µs | 46 µs | ×5.1 |
| ⚡ | Simulation | density_sweep_dense_runs[0.99] | 457.2 µs | 47.8 µs | ×9.6 |
| ⚡ | Simulation | density_sweep_dense_runs[0.9] | 2,716.8 µs | 49.1 µs | ×55 |
| ⚡ | Simulation | density_sweep_random[0.01] | 50.7 µs | 45.5 µs | +11.31% |
| ⚡ | Simulation | density_sweep_random[0.02] | 65.1 µs | 57.1 µs | +14.05% |
| ⚡ | Simulation | density_sweep_random[0.05] | 106.1 µs | 86.8 µs | +22.32% |
| ⚡ | Simulation | density_sweep_random[0.1] | 174.1 µs | 41.5 µs | ×4.2 |
| ⚡ | Simulation | density_sweep_random[0.5] | 708.1 µs | 45.5 µs | ×16 |
| ⚡ | Simulation | density_sweep_random[0.95] | 1,449.1 µs | 49.9 µs | ×29 |
| ⚡ | Simulation | density_sweep_random[0.9999] | 219.2 µs | 45.8 µs | ×4.8 |
| ... | ... | ... | ... | ... | ... |
ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.
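For reading the Efficiency column, here is a minimal sketch of the arithmetic, assuming the column is derived from the BASE / HEAD ratio (the function names `speedup` and `efficiency_pct` are illustrative, not CodSpeed API). The reported percentages appear to be computed from unrounded timings, so recomputing them from the rounded table values may differ in the last digits.

```rust
/// Speed-up factor of HEAD over BASE: values above 1.0 mean HEAD is faster.
/// Assumption: this is how the "×N" entries in the table are obtained.
fn speedup(base_us: f64, head_us: f64) -> f64 {
    base_us / head_us
}

/// Signed percentage form of the same ratio: positive is an improvement,
/// negative is a regression (HEAD slower than BASE).
fn efficiency_pct(base_us: f64, head_us: f64) -> f64 {
    (speedup(base_us, head_us) - 1.0) * 100.0
}
```

With the rounded values from the table, `speedup(578.7, 44.5)` is roughly 13, matching the `×13` rows, and `efficiency_pct(187.9, 225.3)` is roughly -16.6, matching the single regression.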
Comparing ji/bool-filter-optimized (afac7dd) with develop (251e603)
This PR has been marked as stale because it has been open for 30 days with no activity. Please comment or remove the stale label if you wish to keep it active, otherwise it will be closed in 7 days.

This PR has been marked as stale because it has been open for 14 days with no activity. Please comment or remove the stale label if you wish to keep it active, otherwise it will be closed in 7 days.
Force-pushed from 23cdf5f to 1522ebb
Force-pushed from 7b91127 to 8887220
joseph-isaacs left a comment
I cannot approve but I do approve this
Benchmarks: PolarSignals Profiling
Vortex (geomean): 1.048x ➖
datafusion / vortex-file-compressed (1.048x ➖, 0↑ 2↓)

File Sizes: PolarSignals Profiling
File Size Changes (1 file changed, -0.0% overall, 0↑ 1↓)

Benchmarks: FineWeb NVMe
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.013x ➖, 0↑ 0↓)
datafusion / vortex-compact (0.996x ➖, 0↑ 0↓)
datafusion / parquet (0.996x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.077x ➖, 0↑ 3↓)
duckdb / vortex-compact (1.012x ➖, 0↑ 1↓)
duckdb / parquet (1.087x ➖, 0↑ 4↓)

File Sizes: FineWeb NVMe
File Size Changes (2 files changed, -0.0% overall, 0↑ 2↓)

Benchmarks: TPC-H SF=1 on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.946x ➖, 11↑ 2↓)
datafusion / vortex-compact (1.018x ➖, 4↑ 6↓)
datafusion / parquet (0.926x ➖, 8↑ 1↓)
datafusion / arrow (0.860x ✅, 18↑ 0↓)
duckdb / vortex-file-compressed (0.902x ➖, 10↑ 0↓)
duckdb / vortex-compact (0.945x ➖, 4↑ 0↓)
duckdb / parquet (0.965x ➖, 3↑ 0↓)
duckdb / duckdb (0.934x ➖, 3↑ 0↓)

File Sizes: TPC-H SF=1 on NVME
File Size Changes (18 files changed, -0.0% overall, 0↑ 18↓)

Benchmarks: TPC-DS SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.033x ➖, 0↑ 6↓)
datafusion / vortex-compact (1.025x ➖, 0↑ 3↓)
datafusion / parquet (1.010x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (1.007x ➖, 7↑ 2↓)
duckdb / vortex-compact (1.013x ➖, 5↑ 4↓)
duckdb / parquet (1.017x ➖, 0↑ 2↓)
duckdb / duckdb (1.011x ➖, 0↑ 2↓)

File Sizes: TPC-DS SF=1 on NVME
File Size Changes (48 files changed, -0.0% overall, 0↑ 48↓)

Benchmarks: FineWeb S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.007x ➖, 0↑ 1↓)
datafusion / vortex-compact (1.120x ➖, 0↑ 1↓)
datafusion / parquet (1.070x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.933x ➖, 1↑ 0↓)
duckdb / vortex-compact (1.038x ➖, 0↑ 0↓)
duckdb / parquet (1.012x ➖, 0↑ 0↓)

Benchmarks: Statistical and Population Genetics
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (1.015x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.024x ➖, 0↑ 0↓)
duckdb / parquet (0.999x ➖, 0↑ 0↓)

File Sizes: Statistical and Population Genetics
File Size Changes (2 files changed, -0.0% overall, 0↑ 2↓)

Benchmarks: Random Access
Vortex (geomean): 0.912x ➖
unknown / unknown (0.949x ➖, 11↑ 1↓)

Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.035x ➖, 0↑ 0↓)
datafusion / vortex-compact (1.046x ➖, 0↑ 0↓)
datafusion / parquet (1.030x ➖, 0↑ 0↓)
datafusion / arrow (1.046x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.031x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.032x ➖, 0↑ 0↓)
duckdb / parquet (1.018x ➖, 0↑ 0↓)
duckdb / duckdb (1.023x ➖, 0↑ 0↓)

File Sizes: TPC-H SF=10 on NVME
File Size Changes (48 files changed, -0.0% overall, 0↑ 48↓)

Benchmarks: Clickbench on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.022x ➖, 0↑ 1↓)
datafusion / parquet (1.017x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.986x ➖, 6↑ 2↓)
duckdb / parquet (1.006x ➖, 0↑ 0↓)
duckdb / duckdb (1.001x ➖, 5↑ 1↓)

File Sizes: Clickbench on NVME
File Size Changes (201 files changed, -0.0% overall, 0↑ 201↓)

Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.065x ➖, 1↑ 3↓)
datafusion / vortex-compact (0.863x ➖, 3↑ 0↓)
datafusion / parquet (1.092x ➖, 0↑ 5↓)
duckdb / vortex-file-compressed (1.020x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.004x ➖, 0↑ 0↓)
duckdb / parquet (1.033x ➖, 0↑ 0↓)

Benchmarks: Compression
Vortex (geomean): 1.004x ➖
unknown / unknown (1.011x ➖, 1↑ 7↓)

Benchmarks: TPC-H SF=10 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.941x ➖, 1↑ 0↓)
datafusion / vortex-compact (0.963x ➖, 2↑ 1↓)
datafusion / parquet (1.000x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (1.067x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.141x ➖, 0↑ 2↓)
duckdb / parquet (1.017x ➖, 0↑ 0↓)

Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
Force-pushed from 926a88a to af0ee1e
I have added handling for non-zero offsets in the `byte_compress` method.
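To illustrate why a non-zero offset needs special handling, here is a minimal sketch of reading one logical byte from a bit buffer whose first logical bit is not byte-aligned. It assumes an LSB-first bit layout; `logical_byte` is a hypothetical helper for illustration and is not the PR's `byte_compress` implementation.

```rust
/// Returns the 8 logical bits starting at logical bit `8 * i` of a buffer
/// whose first logical bit sits at physical bit `offset` (LSB-first layout).
/// Hypothetical helper: bits past the end of `bytes` are read as zero.
fn logical_byte(bytes: &[u8], offset: usize, i: usize) -> u8 {
    let bit = offset + 8 * i;
    let (byte, shift) = (bit / 8, bit % 8);
    // Low part: the remaining high bits of the physical byte we start in.
    let lo = bytes[byte] >> shift;
    if shift == 0 {
        // Byte-aligned: the physical byte is the logical byte.
        return lo;
    }
    // High part: the low `shift` bits of the next physical byte, if any.
    let hi = bytes.get(byte + 1).copied().unwrap_or(0);
    lo | (hi << (8 - shift))
}
```

With a zero offset this degenerates to a plain byte load, which is why a kernel written only for the aligned case silently produces wrong results on sliced arrays.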
Iterate bit buffers instead of indices or slices when filtering boolean arrays
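As a rough illustration of the idea in the description, the sketch below filters a bit-packed boolean array by walking the mask one 64-bit word at a time rather than materialising indices or slices. It is a sketch under stated assumptions, not the PR's kernel: the LSB-first `u64` word layout and the names `pext64`, `BitWriter` and `filter_bits` are all hypothetical, and the portable bit-extract loop stands in for whatever byte-level or hardware compress step the real code uses.

```rust
/// Portable software "parallel bit extract": gathers the bits of `value`
/// selected by `mask` into the low bits of the result.
fn pext64(value: u64, mut mask: u64) -> u64 {
    let mut out = 0u64;
    let mut k = 0u32;
    while mask != 0 {
        let bit = mask.trailing_zeros();
        out |= ((value >> bit) & 1) << k;
        k += 1;
        mask &= mask - 1; // clear the lowest set bit
    }
    out
}

/// Growing LSB-first bit buffer (hypothetical helper type).
struct BitWriter {
    words: Vec<u64>,
    len: usize, // number of valid bits
}

impl BitWriter {
    fn new() -> Self {
        Self { words: Vec::new(), len: 0 }
    }

    /// Append the low `n` bits of `bits` (n <= 64).
    fn push_bits(&mut self, bits: u64, n: usize) {
        if n == 0 {
            return;
        }
        let bits = if n == 64 { bits } else { bits & ((1u64 << n) - 1) };
        let shift = self.len % 64;
        if shift == 0 {
            self.words.push(bits);
        } else {
            // Fill the remainder of the last word, then spill if needed.
            *self.words.last_mut().unwrap() |= bits << shift;
            if shift + n > 64 {
                self.words.push(bits >> (64 - shift));
            }
        }
        self.len += n;
    }
}

/// Keep the bits of `values` whose corresponding `mask` bit is set.
/// Returns the packed output words and the number of valid output bits.
fn filter_bits(values: &[u64], mask: &[u64], len: usize) -> (Vec<u64>, usize) {
    assert!(values.len() * 64 >= len && mask.len() * 64 >= len);
    let mut out = BitWriter::new();
    for (i, (&v, &m)) in values.iter().zip(mask).enumerate() {
        let start = i * 64;
        if start >= len {
            break;
        }
        // Ignore mask bits beyond the logical length in the final word.
        let valid = (len - start).min(64);
        let m = if valid == 64 { m } else { m & ((1u64 << valid) - 1) };
        match m {
            0 => {}                           // nothing selected: skip the word
            u64::MAX => out.push_bits(v, 64), // everything selected: copy the word
            _ => out.push_bits(pext64(v, m), m.count_ones() as usize),
        }
    }
    (out.words, out.len)
}
```

The all-zero and all-one fast paths cost one comparison per 64 elements. In this sketch the mixed case costs one loop step per selected bit, so runtime depends on mask density, unlike the nearly flat HEAD timings in the benchmark table.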